[Bugfix][plugin] Fix FLA crash on plugin #27322
Conversation
Code Review
This pull request addresses a crash in Flash-Linear-Attention (FLA) operations when used with plugins. The changes in vllm/model_executor/layers/fla/ops/utils.py are well-reasoned and effective. By leveraging current_platform.is_cuda_alike(), the code now correctly identifies CUDA-compatible platforms (including plugins) and sets the device library appropriately. Adding a None default to getattr is a good defensive measure that prevents crashes on other platforms like CPU, making the utility more robust. The fix is correct and improves the overall stability of FLA operations in diverse environments.
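The selection logic the review describes can be sketched as follows. This is an illustrative stand-in, not the actual code in `vllm/model_executor/layers/fla/ops/utils.py`: `resolve_device_lib` and the fake namespace are hypothetical names, and a `types.SimpleNamespace` substitutes for the real `torch` module so the sketch runs without GPU libraries installed.

```python
import types

# Stand-in "torch-like" namespace so this sketch runs anywhere; on a real
# system this would be the `torch` module itself, with `torch.cuda` etc.
fake_torch = types.SimpleNamespace(cuda="torch.cuda (stand-in)")


def resolve_device_lib(torch_mod, is_cuda_alike: bool, device_name: str):
    """Mimic the device-library selection described in the review above.

    CUDA-alike platforms (NVIDIA, AMD ROCm, and out-of-tree plugins such as
    vllm-metax, whose device name is "maca") still map to torch.cuda; any
    other platform falls back to a getattr lookup with a None default
    instead of raising AttributeError.
    """
    if is_cuda_alike:
        return torch_mod.cuda
    return getattr(torch_mod, device_name, None)
```

With this shape, a plugin whose device name has no matching `torch` submodule gets `None` back rather than crashing at import time.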
NickLucche
left a comment
I think this looks fine but I don't have the context on fla.
Perhaps @youkaichao can take a quick look at it.
Force-pushed from 873d0a9 to b483cc9
mgoin
left a comment
Looks simple enough to me. I believe the logic is kept the same for NVIDIA and AMD, so nothing changes for Intel or CPU.
Signed-off-by: Hank <hcc.mayday@gmail.com>
Force-pushed from 59e4598 to e379c89
related: vllm-project/vllm/pull/27322
Signed-off-by: Hank <hcc.mayday@gmail.com>

* support platform and remove kernel copy Signed-off-by: Hank <hcc.mayday@gmail.com>
* update pre-commit Signed-off-by: Hank <hcc.mayday@gmail.com>
* update version and requirements Signed-off-by: Hank <hcc.mayday@gmail.com>
* update flashinfer Signed-off-by: Hank <hcc.mayday@gmail.com>
* update build requirements Signed-off-by: Hank <hcc.mayday@gmail.com>
* update attention backends Signed-off-by: Hank <hcc.mayday@gmail.com>
* update patch Signed-off-by: Hank <hcc.mayday@gmail.com>
* update quant_method Signed-off-by: Hank <hcc.mayday@gmail.com>
* update fuse_moe (todo: fix mypy) Signed-off-by: Hank <hcc.mayday@gmail.com>
* update `deepseek_v2.py` (todo: fix indexer kernel) Signed-off-by: Hank <hcc.mayday@gmail.com>
* [feat] support bf16 cp_gather_indexer_k_cache kernel Signed-off-by: Xin Li <lixin1620@gmail.com>
* [fix] fix type error in bf16_paged_mqa_logits Signed-off-by: leex404 <lixin1620@gmail.com>
* [feat] add topk logits ops Signed-off-by: leex404 <lixin1620@gmail.com>
* [fix] private memory size too large in `sample_recovered_tokens_kernel` (#115)
  * [fix] fix sample_recovered_tokens_kernel use too much private memory Signed-off-by: Xin Li <xin.li@metax-tech.com>
  * [fix] fix type error in bf16_paged_mqa_logits Signed-off-by: Xin Li <xin.li@metax-tech.com>
  * [chore] change file directory Signed-off-by: Xin Li <xin.li@metax-tech.com>
  ---------
  Signed-off-by: Xin Li <xin.li@metax-tech.com>
  Co-authored-by: Xin Li <xin.li@metax-tech.com>
  Signed-off-by: leex404 <lixin1620@gmail.com>
* [fix] fix missing topk logits custom ops definition Signed-off-by: leex404 <lixin1620@gmail.com>
* [fix] add custom gptq_shuffle ops Signed-off-by: leex404 <lixin1620@gmail.com>
* [fix] fix compile error Signed-off-by: leex404 <lixin1620@gmail.com>
* platform config update Signed-off-by: Hank <hcc.mayday@gmail.com>
* update qwen2.5_vl model Signed-off-by: Hank <hcc.mayday@gmail.com>
* [fix] fix torch not found maca device Signed-off-by: leex404 <lixin1620@gmail.com>
* remove hotfixes patch for torch2.8 Signed-off-by: Hank <hcc.mayday@gmail.com>
* remove needless patch related: vllm-project/vllm/pull/27322 Signed-off-by: Hank <hcc.mayday@gmail.com>
* [feat] topk_softmax support renormalize and bf16 Signed-off-by: leex404 <lixin1620@gmail.com>
* [fix] update fused_moe to fit v0.11.1 Signed-off-by: leex404 <lixin1620@gmail.com>
* [fix] fix fused moe config log missing Signed-off-by: leex404 <lixin1620@gmail.com>
* use flash_attn as vit attn backend on qwen_vl Signed-off-by: Hank <hcc.mayday@gmail.com>
* update quant_conf registry Signed-off-by: Hank <hcc.mayday@gmail.com>
* fix and apply latest pre-commit of v0.11.1 Signed-off-by: Hank <hcc.mayday@gmail.com>
* [feat] Keep all AITER kernels in _aiter_ops Signed-off-by: leex404 <lixin1620@gmail.com>
* fix pre-commit on type casting Signed-off-by: Hank <hcc.mayday@gmail.com>
* [fix] fix DeepSeek import error Signed-off-by: leex404 <lixin1620@gmail.com>
* [feat] update deepseek_v2 to fit v0.11.1 Signed-off-by: leex404 <lixin1620@gmail.com>

---------
Signed-off-by: Hank <hcc.mayday@gmail.com>
Signed-off-by: Xin Li <lixin1620@gmail.com>
Signed-off-by: leex404 <lixin1620@gmail.com>
Co-authored-by: Xin Li <xin.li@metax-tech.com>
Co-authored-by: leex404 <lixin1620@gmail.com>
Co-authored-by: leex404 <42941760+leex404@users.noreply.github.com>
Purpose
There is a problem when supporting FLA on a plugin: importing `fla/ops/utils` crashed here. In a plugin, `device` may have its own value (in vllm-metax it is `maca`), while `device_torch_lib` still needs to be the plugin's own library (in vllm-metax, `torch.cuda`). This PR therefore uses `is_cuda_alike` and passes a default value of `None` to `getattr` to handle these corner cases. The semantics are consistent with the original code.

Test Plan
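As a minimal illustration of the failure mode described under Purpose (the names here are hypothetical; the real lookup lives in `vllm/model_executor/layers/fla/ops/utils.py`), a `types.SimpleNamespace` stands in for `torch` on a platform whose plugin reports a custom device name:

```python
import types

# A torch-like namespace with no "maca" attribute, standing in for `torch`
# when the plugin's device name has no matching torch submodule.
torch_like = types.SimpleNamespace(cuda="cuda-lib")

# Before the fix: a two-argument getattr raises AttributeError for an
# unknown device name, which is the crash this PR addresses.
try:
    getattr(torch_like, "maca")
    crashed = False
except AttributeError:
    crashed = True

# After the fix: passing a None default makes the same lookup safe on any
# platform, so import no longer fails.
safe_lookup = getattr(torch_like, "maca", None)
```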
Test Result